Outline

  • Introduction

  • Materials and Methods

  • Results

  • Discussion

  • Conclusion

Introduction

  • Data set of rural people from Bangladesh with or without T1 diabetes

  • Contains 306 data points and 22 variables

  • Exploratory Data Analysis

  • Classify children with T1 diabetes and explore which variables are important for T1 diabetes

Materials and Methods

  • Obtain data set

  • Data Wrangling

  • EDA

  • Analysis and Modeling

  • RF model

  • Shiny App

  • Working collaboratively using RStudio Cloud and GitHub
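
As an illustration of the "RF model" step listed above, here is a minimal sketch of fitting a random forest classifier in R. It assumes the `randomForest` package is installed; the data are simulated, and the column names `age`, `bmi` and `affected` are hypothetical placeholders, not variables from the actual data set.

```r
library(randomForest)

set.seed(42)

# Simulated stand-in data (hypothetical columns, not the real data set)
n <- 200
toy <- data.frame(
  age = rnorm(n, mean = 12, sd = 3),
  bmi = rnorm(n, mean = 18, sd = 2)
)
toy$affected <- factor(ifelse(toy$bmi + rnorm(n) > 18, "yes", "no"))

# Hold out 30% of the rows for evaluation
train_idx <- sample(n, size = round(0.7 * n))

# Fit the forest and keep per-variable importance measures
rf_fit <- randomForest(affected ~ ., data = toy[train_idx, ],
                       ntree = 500, importance = TRUE)

# Accuracy on the held-out rows
pred <- predict(rf_fit, newdata = toy[-train_idx, ])
accuracy <- mean(pred == toy$affected[-train_idx])

# Importance matrix: one row per predictor
var_imp <- importance(rf_fit)
```

The importance matrix is one way to explore which variables matter most for the classification; other packages would serve equally well here.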

Results: Data Wrangling

# Load libraries ----------------------------------------------------------
library("tidyverse")

# Load data ---------------------------------------------------------------
my_data_clean <- read_tsv(file = "/cloud/project/data/02_my_data_clean.tsv")


# Wrangle data ------------------------------------------------------------
my_data_clean_aug <- my_data_clean %>%
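  # Split the duration string into its numeric part and its unit suffix
  mutate(Dur_disease = str_extract(`Duration of disease`, "\\d+\\.?\\d*"),
         # Remove the number with a fixed pattern instead of using the extracted
         # value as a regex (where "." would match any character)
         unit = str_remove(`Duration of disease`, "\\d+\\.?\\d*")) %>%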
  select(-`Duration of disease`)
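
To make the splitting step above concrete, here is a small sketch on made-up values. The strings in `toy` are assumptions about the column format (a number directly followed by a unit letter) and are not taken from the data set.

```r
library(tidyverse)

# Hypothetical example values; the real column may contain other formats
toy <- tibble(`Duration of disease` = c("3d", "2w", "1.5y", NA))

toy_split <- toy %>%
  mutate(
    # Numeric part, still stored as character at this stage
    Dur_disease = str_extract(`Duration of disease`, "\\d+\\.?\\d*"),
    # Unit suffix: the first run of letters
    unit = str_extract(`Duration of disease`, "[A-Za-z]+")
  ) %>%
  select(-`Duration of disease`)
```

Missing durations stay `NA` in both new columns, so they can be handled explicitly in the next step.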

Results: Data Wrangling

# Converting duration to days for every value
my_data_clean_aug <- my_data_clean_aug %>%
  mutate(Dur_disease = as.numeric(Dur_disease)) %>%
  mutate(Dur_disease = case_when(unit == "d" ~ Dur_disease,
                                 unit == "w" ~ Dur_disease * 7,
                                 unit == "m" ~ Dur_disease * 30,
                                 unit == "y" ~ Dur_disease * 365),
         Dur_disease = replace_na(Dur_disease, 0)) %>%
  # We do not need the unit column anymore
  select(-unit) %>%
  # Separating "Other diease" column into three
  separate(`Other diease`,
           into = c("first_disease",
                    "second_disease",
                    "third_disease"),
           sep = ",")
## Warning: Expected 3 pieces. Missing pieces filled with `NA` in 305 rows [1, 2,
## 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].
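
The warning above appears because most rows list fewer than three diseases. A minimal sketch of the same pipeline on made-up rows, passing `fill = "right"` to `separate()` so that missing pieces are padded with `NA` without a warning; the disease names are hypothetical.

```r
library(tidyverse)

# Hypothetical rows; disease names are placeholders
toy <- tibble(
  Dur_disease = c(3, 2, 1.5, NA),
  unit = c("d", "w", "y", NA),
  `Other diease` = c("none", "asthma,anemia", NA, "a,b,c")
)

toy_days <- toy %>%
  # Convert every duration to days (a month is approximated as 30 days)
  mutate(Dur_disease = case_when(unit == "d" ~ Dur_disease,
                                 unit == "w" ~ Dur_disease * 7,
                                 unit == "m" ~ Dur_disease * 30,
                                 unit == "y" ~ Dur_disease * 365),
         # Unknown or missing units fall through to NA and become 0
         Dur_disease = replace_na(Dur_disease, 0)) %>%
  select(-unit) %>%
  separate(`Other diease`,
           into = c("first_disease", "second_disease", "third_disease"),
           sep = ",",
           fill = "right")  # pad missing pieces on the right with NA
```

Note that replacing a missing duration with 0 makes it indistinguishable from a true duration of zero days, which may or may not be the intended behaviour.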

Results: Data Visualisation

Results: EDA

Fig A

Results: EDA

A caption

Results: Analysis and Modeling

A caption

Results: Analysis and Modeling

The data are well separated, so classification seems feasible.

Discussion

Conclusion